Reactions – MAS962 Goldstone

Greg Detre

Tuesday, October 01, 2002

 

Reading – Goldstone & Rogosky (2002)

Abstract

According to an "external grounding" theory of meaning, a concept's meaning depends on its connection to the external world. By a "conceptual web" account, a concept's meaning depends on its relations to other concepts within the same system. We explore one aspect of meaning, the identification of matching concepts across systems (e.g. people, theories, or cultures). We present a computational algorithm called ABSURDIST (Aligning Between Systems Using Relations Derived Inside Systems for Translation) that uses only within-system similarity relations to find between-system translations. While illustrating the sufficiency of a conceptual web account for translating between systems, simulations of ABSURDIST also indicate powerful synergistic interactions between intrinsic, within-system information and extrinsic information.

Reading

Goldstone & Rogosky make two important points. The first is that it does make sense to talk of two people sharing the same concept, even when that concept is defined slightly differently for each of them, or bears different relations to the other concepts they hold. They provide a computational algorithm (ABSURDIST) which shows how "concepts", defined solely in terms of pairwise "distance" measures from one another, can be matched across systems, without being based on extrinsic information at all.

Secondly, they show that their algorithm works even better when extrinsic information is incorporated. By "extrinsic information", they mean any information that is not captured purely by within-system relations. This is intended to demonstrate that theories of meaning based on external grounding (some causal connection to the real world, often mediated by perceptual mechanisms) need not be incompatible with the conceptual web accounts vindicated by their algorithm.
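A minimal sketch of the setup may help (my own illustration in Python, not their code; all names are mine): each system exposes only its matrix of within-system distances, and a translation is an assignment of one system's elements to the other's that makes the two matrices agree.

import numpy as np

def within_system_distances(points, r=2.0):
    """Pairwise Minkowski distances inside one system.
    points: (n, d) array of concept coordinates, never shared across systems."""
    diffs = np.abs(points[:, None, :] - points[None, :, :])
    return (diffs ** r).sum(axis=2) ** (1.0 / r)

def translation_quality(D_A, D_B, assignment):
    """How well the mapping (element i of A -> element assignment[i] of B)
    preserves within-system distances; lower is better."""
    permuted = D_B[np.ix_(assignment, assignment)]
    return np.abs(D_A - permuted).sum()

Nothing in translation_quality ever compares a coordinate of A with a coordinate of B; only the relational structure crosses the system boundary.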


Reactions – Goldstone & Rogosky

Fodor's such a monkey:

"to say that two people share a concept (i.e. that they have literally the same concept) is thus to say that they have tokens of literally the same concept type"

(Fodor, p. 28, cited in article)

there are no tokens!

 

Discarded

 

Questions

external information vs extrinsic information???

is this kind of analogous to Google's PageRank???

see questions scribbled in margins of printout

does the system work better with more nodes???

"the algorithm's ability to recover true correspondences generally increases as a function of the number of elements in each system, at least for small levels of noise"

but presumably would require more iterations…

could you use some sort of simulated annealing approach???

isn't that what L (the learning rate) would allow you to do???
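e.g. something like this (pure speculation on my part; as far as I can tell the paper keeps L fixed, and annealed_L is my own hypothetical name):

def annealed_L(t, L0=0.5, decay=0.01):
    """Hypothetical cooling schedule for the correspondence-update rate L:
    large early steps to explore, small late steps to settle down.
    Not from the paper, just the obvious simulated-annealing analogue."""
    return L0 / (1.0 + decay * t)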

could you call this a connectionist system???

would the system work better with different types of nodes (e.g. "is-a", "part-of" etc.)???

can you tweak the "cut-off" point of whether or not to consider a subset (e.g. of two items) a match???

does it actually make sense to talk of a cut-off point??? is it trying to find a global match???

how similar is this to the stereopsis problem???

actually, I think it's pretty damn similar. in the stereopsis problem you're looking for one solution, but you don't know for sure where the object boundaries/discontinuities are (at least under Marr's assumption of purely bottom-up, low-level processing), so you have to choose between multiple possible solutions that place those boundaries in different places; and although you want to match as much as possible, you have to accept that sometimes there won't actually be a perfect global match

is it able to see high-level matches, e.g. between groups???

I guess I just don't fully understand what it's doing/matching :(

if it could group correspondences, you'd see analogies/correspondences that work at different(/multiple???) levels emerging, wouldn't you???

in order for it to group correspondences, wouldn't it have to do more than pairwise relations???

how would you ground it??? how would you create a grounded sample dataset???

is the dimensionality of the (Minkowskian) space separate from the number of nodes???

hmmm – I don't think the question is correctly worded, but:

"If the two dimensions reflect size and brightness, for example, then for q and x to have similar coordinates would mean that they have similar physical appearances along these perceptual dimensions."

if you could get it to work with sparse inter-connections, and you could get it to work with subsets, then you'd start to move towards a really interesting bootstrapping algorithm for seeing analogies at multiple/higher levels

presumably, if you used a statistical procedure to derive vector representations of various words in two large corpora of text in different languages, you would be able to use absurdist to create a sort of inter-language dictionary on the fly
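roughly like this (a hand-wavy sketch; the corpus step and all the names here are mine):

import numpy as np

def similarity_matrix(vectors):
    """Cosine similarities between word vectors within one language.
    vectors: (n_words, d) array from whatever statistical procedure
    you like (co-occurrence counts, SVD, etc.)."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)
    return unit @ unit.T

# S_en = similarity_matrix(english_vectors)  # within-English relations only
# S_fr = similarity_matrix(french_vectors)   # within-French relations only
# ...then feed S_en and S_fr to an absurdist-style aligner: the dictionary
# is whatever set of correspondences it settles on.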

are there other/existing easier ways to do this???

unfortunately, there's no way I can think of to employ an absurdist model in (say) my dual-agent vocabulary model because it requires an objective mind-reader to look inside and compare the representations, right???

yes, but it might be a really good tool for (the programmer) analysing how close the formed representations are!

e.g. how close the two agents' representations of the same domain/thing/word are, or even how close a single agent's spatial/temporal representation is, or how much it changes over time

again, same question as before – without the absurdist algorithm, how would I have done this???

if the agents' facial expressions were to systematically reveal some subtle internal features of their representations of whatever was going on in their environment, might that not give enough information for absurdist to at least partially latch on to (even a tiny bit of extrinsic grounding added to an intrinsically grounded system might help)???

consider how much information we do have: facial expressions, intonation, body language, subtle diction choices etc.

perhaps part of the problem with current language games is that the communication protocols are too impoverished…

how would I scale this???

presumably, it becomes much (exponentially???) more complex as it grows, which is why they're using such small systems

"the particular algorithm presented converges relatively quickly on a cross-system translation, and the convergence time does not depend much on the size of systems being aligned. The number of nodes does increase quadratically with the number of elements per system, but this can be reduced by only building correspondence units for alignments that have initial support above a threshold level (Goldstone, 1998), or by using dynamic binding operations to represent correspondences (Hummel & Holyoak, 1997)."
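the thresholding trick might look something like this (my guess at what "initial support above a threshold level" could mean; the sorted-distance-profile measure is mine, and I'm assuming equal-sized systems):

import numpy as np

def candidate_correspondences(D_A, D_B, threshold=0.2):
    """Build correspondence units only for (q, x) pairs with enough initial
    support, instead of all n * n of them."""
    profile_A = np.sort(D_A, axis=1)   # row q: q's distances to the rest of A
    profile_B = np.sort(D_B, axis=1)
    return [(q, x)
            for q in range(D_A.shape[0])
            for x in range(D_B.shape[0])
            if np.abs(profile_A[q] - profile_B[x]).mean() < threshold]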

I still can't figure out what adding different types of links (besides just a similarity-distance metric) would add

hmmm, where does this sit with the Gardenfors conceptual space model???

presumably, it adds a similarity measure…

wasn't that one of the things that he needed???

yes, Gardenfors' similarity metric was a simple Cartesian distance one, but this is much more interesting

absolute coordinate values vs relative/distance values??? meaningless???

algorithm – so, let me get this straight:

you have n concept units in a system

each concept unit is linked to its twin by a correspondence unit

that correspondence unit has a value/activation that's analogous to a connection/synapse weight

the net input to a concept unit shows how strongly it's activated

presumably this is to show that the net inputs to externally grounded twin units should be pretty similar, right???

in the case of the intrinsic-only experiment, is the net input zero, or simply the same for all units???

they had two ways of simulating external grounding – what were they???

was one to introduce different twinned net inputs???

you seed one of the C cells as 1

and the other was to have a non-zero extrinsic coefficient in the formula

what was that formula and how did it work???

what's the third coefficient/term???

arghhhhhhhhhhhhh. I think that when they talk about net input, they're talking about net input to the correspondence unit

presumably, this net input is the product of the two concept units that that particular correspondence unit happens to compare

the algorithm is presumably trying to find the set of correspondence units whose sum is highest

product???

fuckfuckfuck. is this just a simple hebbian learning rule???

Δw_ij = k f_i f_j

having said that, the idea of input to the concept units might be interesting – partly as a way to simulate the same external grounding (i.e. they're both currently looking at the same object) – and partly/similarly as a way of simulating the particular situations in which they find themselves, with some concepts being more heavily activated at one time than another.

arghhhhhhhhhh again. the net input is the formula with the three coefficients, defined just below
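ok, let me write the update out for myself to check that reading (a rough sketch: the shapes of the excitation/inhibition terms and the similarity function S are my reconstruction, not the paper's exact equations):

import numpy as np

def step(C, D_A, D_B, E, L=0.1, alpha=1.0, beta=1.0, chi=1.0):
    """One relaxation step on the n x n correspondence units C (values in [0, 1]).
    D_A, D_B: within-system distance matrices; E: external similarity between
    cross-system pairs (all zeros in the intrinsic-only condition)."""
    n = C.shape[0]
    # S[q, r, x, y]: how consistent distance D_A[q, r] is with D_B[x, y] (a guess)
    S = 1.0 - np.abs(D_A[:, :, None, None] - D_B[None, None, :, :])
    N = np.zeros_like(C)
    for q in range(n):
        for x in range(n):
            # excitation: support from correspondences (r, y) whose
            # within-system distances agree with pairing q with x
            excite = sum(S[q, r, x, y] * C[r, y]
                         for r in range(n) if r != q
                         for y in range(n) if y != x) / (n - 1)
            # inhibition: competing units in the same row or column
            inhibit = (C[q, :].sum() - C[q, x] + C[:, x].sum() - C[q, x]) / (n - 1)
            N[q, x] = alpha * E[q, x] + beta * excite - chi * inhibit
    # nudge each activation toward 1 (positive net input) or 0 (negative)
    return np.clip(np.where(N >= 0, C + L * N * (1 - C), C + L * N * C), 0.0, 1.0)

if that's right, then the net input really does live on the correspondence units, and the Hebbian flavour comes from the excitation term multiplying consistent correspondence activations together. seeding a C cell to 1, or making alpha non-zero, would be the two ways of adding external grounding.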

at the moment there is a right answer, because there is a real world to which the twin concept units in the systems respond similarly. are there any interesting ways to depart from this???

what would happen if the correspondence units ranged from −1 to 1???

would that allow for opposites (i.e. implicitly define a binary type distinction)???

no, because we'd be talking about an inter-agent opposite-relation, rather than an intra-agent relation

after all, what would it mean to say that your concept is the opposite of mine???

I feel like this is different from saying that we have the same concept, but an opposite view/value assigned to it (i.e. we happen to disagree)… hmmm???

for starters, you'd have to re-jig your algorithm, wouldn't you???

hmmm. maybe not

at the very least, you'd have to shift the parameters of your squashing function, which shouldn't be too hard

the Minkowskian parameter (r) can range from 1 to infinity, no matter how many dimensions (n) the space has.

as r → infinity, the emphasis on the single largest dimension-difference increases, until that's all that matters.

does it become binary???

what's the difference between r = 1 and r = 2??? presumably, just a bit closer to r = infinity
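concretely (my own toy example):

import numpy as np

def minkowski(a, b, r):
    return (np.abs(a - b) ** r).sum() ** (1.0 / r)

a, b = np.array([0.0, 0.0]), np.array([3.0, 4.0])
print(minkowski(a, b, 1))    # 7.0  (city-block: the dimensions just add up)
print(minkowski(a, b, 2))    # 5.0  (Euclidean)
print(minkowski(a, b, 50))   # ~4.0 (approaching max(3, 4))

so no, it doesn't become binary; the r → infinity limit is just the largest per-dimension difference (the Chebyshev distance).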

how revolutionary/exciting is the Absurdist algorithm???

does it belong to a wider class of algorithms, e.g. constraint-satisfaction, analogy-makers, the Minkowskian distance metric etc.???

when you're generating a random dataset for A (and B), does it matter how big the distances are??? surely not

this is fucking hopeless – I still don't understand how what the absurdist algorithm is doing differs from just computing the pairwise Minkowski distances for all the concepts and choosing the set which minimises the total distance. well???

ok, so it can deal with comparing systems with differing numbers of concepts, but not with systems where the distances between points are only sparsely specified, right???
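for what it's worth, here's the brute-force version of the comparison I keep imagining (my sketch): try every assignment and keep the one that best preserves within-system distances. that search is O(n!), which is presumably what the relaxation network approximates cheaply:

from itertools import permutations
import numpy as np

def brute_force_translation(D_A, D_B):
    """Exhaustive baseline: the permutation of B's elements whose
    within-system distances best match A's. Only feasible for tiny n."""
    n = D_A.shape[0]
    best, best_cost = None, np.inf
    for perm in permutations(range(n)):
        idx = list(perm)
        cost = np.abs(D_A - D_B[np.ix_(idx, idx)]).sum()
        if cost < best_cost:
            best, best_cost = idx, cost
    return best, best_cost

and unlike this permutation search, a continuous relaxation can tolerate noise and (apparently) systems of different sizes, which might be the real answer to my question.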